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Abstract 

Univariate polynomial root-finding is a classical subject, still important for modern comput¬ 
ing. Frequently one seeks just the real roots of a polynomial with real coefficients. They can be 
approximated at a low computational cost if the polynomial has no nonreal roots, but for high 
degree polynomials, nonreal roots are typically much more numerous than the real ones. The 
challenge is known for a long time, and the subject has been intensively studied. Nevertheless, 
we produce some novel ideas and techniques and obtain dramatic acceleration of the known al¬ 
gorithms. In order to achieve our progress we exploit the correlation between the computations 
with matrices and polynomials, randomized matrix computations, and complex plane geometry, 
extend the techniques of the matrix sign iterations, and use the structure of the companion 
matrix of the input polynomial. The results of our extensive tests with benchmark polynomials 
and random matrices are quite encouraging. In particular in our tests the number of iterations 
required for convergence of our algorithms grew very slowly (if at all) as we increased the degree 
of the univariate input polynomials and the dimension of the input matrices from 64 to 1024. 

Keywords: Polynomials, Real roots. Matrices, Matrix sign iterations. Companion matrix, Frobe- 
nius algebra. Square root iterations. Root squaring 


1 Introduction 

Assume a univariate polynomial of degree n with real coefficients, 

n n 

=PnW_{x- Xj), Pn^O, (1.1) 

i=0 j=l 

which has r real roots xi,... ^Xr and s = {n — r)/2 pairs of nonreal complex conjugate roots. In 
some applications, e.g., to algebraic and geometric optimization, one seeks only the r real roots, 
which make up just a small fraction of all roots. This is a well studied subject (see [13 Section 
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10.3.5], [3n]7 US]? and the bibliography therein), but the most popular packages of subroutines for 
root-finding such as MPSolve 2000 [S], Eigensolve 2001 [T7], and MPSolve 2012 [10] approximate 
the r real roots about as fast and as slow as all the n complex roots. It can be surprising, but we 
present some novel methods that accelerate the solution by a factor of u/r, which means dramatic 
speed up in the cited important applications. 

The springboard for our real root-finders is the matrix sign iterations, which we apply to the 
companion matrix of an input polynomial. It is a well known technique for matrix computations 
[20] . and we made it particularly efficient for real polynomial root-finding, although it has never 
been used for this purpose so far. We combine it with various old and new techniques that support 
fast convergence of the iterations and their numerical validity. 

Our tests for benchmark polynomials confirm the efficiency of this approach. In particular, 
the number of iterations was typically quite small and grew very slowly (if it grew at all) as the 
polynomial degree increased from 64 to 1024. 

Some of our techniques should be of independent interest, e.g., our numerical stabilization in 
Section [TS] our exploitation of matrix functions and randomized matrix computations in Algorithm 
l3Tl and the combination of our maps of the complex plane with rational transformations of the 
variable and the roots. Some of our algorithms (e.g., the ones of Section Isa combine operations 
with matrices and polynomials, demonstrating once again the value of synergistic combinations of 
this kind, which we have been advocating since [55] and [5]. 

Our goal in this paper is to present a novel approach to real root-finding for a polynomial and 
to demonstrate the promise of this approach by performing some preliminary tests. We hope that 
we have advanced toward our goals substantially, but the implementation of our algorithms at the 
competitive level is our next project, which requires substantial further effort. For example, Stage 
3 of our Algorithm 3.1 is reduced to the inversion or orthogonalizaton of Toeplitz-like matrices, and 
the customary numerical algorithms, currently available for these operations, can be dramatically 
accelerated by extending the techniques of the papers [54] and [55] . 

We organize our paper as follows. In the next section we cover some background material. We 
present a variety of our real polynomial root-finders in Section [3] In Section [4] (the contribution of 
the second author) we present the results of our numerical tests. In the Appendix we cover some 
auxiliary results. 


2 Basic Definitions and Results 

Hereafter “flop” stands for “floating point arithmetic operation”, assumed to be performed numeri¬ 
cally, with bounded precision, e.g., the standard IEEE double precision. 

2.1 Some Basic Definitions for Matrix Computations 

• denotes the linear spaces of complex m x n matrices. is its subspace of m x n 

real matrices. 

• M'^ = (mji)™’2i transpose of a matrix M = S j^h jg j^g Uermitian 

transpose. for a real matrix M. 

• ||M|| = ||M ||2 denotes its spectral norm. 

• I = In is the n X n identity matrix. 

• diag(6_,)J_j^ = diag(6i,..., 6^,) is the s x s diagonal matrix with the diagonal entries 6i, ..., bg. 

• TZ{M) is the range of a matrix M, that is, the linear space generated by its columns. 

• A matrix of full column rank is a matrix basis of its range. 

• A matrix Q is orthogonal or unitary if Q^Q = / or QQ^ = /. 
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• Suppose an TO X n matrix M has rank n (and so to > n). Write {Q,R) = {Q{M),R{M)) to 
denote a unique pair of unitary m x n matrix Q and upper triangular n x n matrix R such 
that M = QR and all diagonal entries of the matrix R are positive [HI Theorem 5.2.3]. 

• M+ is the unique Moore-Penrose pseudo inverse of M [H Section 5.5.2], equal to if and 
only if the matrix M is unitary. 

• An m X n matrix M has an n x to left inverse matrix X = such that XM = In ii and 

only if it has full column rank n. In this case M'*' is a left inverse. The left inverse is unique 
if and only if M is a nonsingular matrix, in which case m = n and = M~^. 

• The e-rank of a matrix M is the minimal rank of the matrices in its e-neighborhood. Numerical 
rank nrank(M) is the e-rank where e is small in context. 

Definition 2.1. Eigenvalues, eigenvectors and eigenspaces. 

• A scalar x is an eigenvalue of a matrix M associated with an eigenvector v if Mv = x'v. 

• The eigenvectors associated with an eigenvalue x or, more generally, with any set of the eigen¬ 

values X G X(M) form the eigenspaces S{M,x) and S{M,X), respectively, associated with 
the eigenvalue x and the set X of eigenvalues, respectively. A linear subspace S is an 

eigenspace of a matrix M if and only if MS = {Mv : v G 5} C 5 (see \45[ Definition 4-Tl])- 

• An eigenvalue x of a matrix M is a root of the characteristic polynomial det{xl — M). The 
multiplicity of this root is the algebraic multiplicity of the eigenvalue x, denoted am(x). The 
dimension gm{x) = dim(5(M, a;)) is the geometric multiplicity of x, never exceeding am{x). 
An eigenvalue x is simple if gm{x) = 1. 

2.2 The Companion Matrix and the Frobenius Algebra 


ej = (0,0 

..., 0,1) denote 

the 
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Z is the down-shift matrix. Cp is the companion matrix of the polynomial p{x) of (11.11) , which is 
the characteristic polynomial of this matrix. Hence real root-finding for the polynomial p{x) turns 
into real eigen-solving for this matrix. 

Z\ = for a vector v = (ui)iLi and for uq = 0. 

Theorem 2.1. (The Cost of Computations in the Frobenius Matrix Algebra.j The companion 
matrix Cp G of a polynomial p{x) of il.ll) generates the Frobenius matrix algebra. 

(i) One needs 0{n) flops for addition, 0{n\og{n)) for multiplication, and 0(nlog^(n)) for in¬ 
version in this algebra. One needs 0{nlog{n)) flops to multiply a matrix in this algebra by a vector. 

(ii) Numerically stable inversion of a matrix in the Frobenius matrix algebra can be performed 
by using 0(nlog(n)) flops. 

Proof. The estimates of part (i) can be found in m and [32 • Part (ii) has been supported by the 
algorithms of [54] accelerated by a factor of log(n) in [36l Section 9.8]. □ 
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2.3 Decreasing the Size of an Eigenproblem 

An eigenvalue x of a matrix M as well as a set of eigenvalues X are dominant if they are absolutely 
larger than all the other eigenvalues. An eigenspace is called dominant if it is associated with a 
dominant eigenvalue or a dominant set of eigenvalues. 

The set X{M) of all eigenvalues of a matrix M is called its spectrum. 

The Power Method [TH Section 7.3.1] computes the vector M^v, for a random vector v and 
a sufficiently large integer k. The 1-dimensional vector space for t G C, is expected to 

approximate the eigenspace associated with an eigenvalue x if it is dominant and simple. This would 
not work only if the vector v has an absolutely small component along the eigenvector associated 
with this eigenvalue x, but such an event is unlikely, for a random vector v. One can choose fc = 1 if 
the domination of the eigenvalue x in the spectrum of M is strong. Let us extend the Power Method 
for fc = 1 to the approximation of a strongly dominant eigenspace of a dimension r. 

Algorithm 2.1. Approximation of the dominant eigenspace. 

Input: an n x n matrix M, the dimension r of its dominant eigenspace Id, 0 < r < n, and two 
tolerance bounds: a positive integer K and a positive e. 

Output: FAILURE (with a low probability) or a unitary matrix U whose range approximates the 
eigenspace U. 

Computations: 

1. Apply the randomized algorithm of m, which at first generates a standard Gaussian random 
n X r+ matrix G for a proper integer > r and then computes the matrix H = MG and the 
numerical rank nrank(i7). 

2. Unless nrank(i7) = r, re-apply the algorithm of \19f up to K times until the equation nrank(i7) = 
r is observed. If it is never observed, output FAILURE (this occurs with a probability near 0). 

3. //nrank(iJ) = r, then compute the QR factorization H = Q{H)R{H), output annxr unitary 
matrix U approximating the first r columns of the matrix Q{H), and stop. (The analysis in 
m Section 4], !37l Section 7.)], and 142[ Corollary 2.1] shows that, with a probability close 
to 1, the columns of the matrix U closely approximate a unitary basis of the eigenspace U and 
that ||M— UU^M\ \ < e||M||. The latter bound would certify correctness of the output.) 

The arithmetic cost of performing the algorithm is 0(n^r_|_), but decreases to 0(nr+(r_|_-|-log(n)), 
for M = Gp, by virtue of Theorem 12.11 It increases by a factor of log(r) if the dimension r 
of the eigenspace U is not available, but is computed by using binary search that begins with 
recursive doubling of the candidate integer values 1, 2, 4, etc. The algorithm generates nrp random 
parameters, but its modification using the structured (so called SRFT) multipliers G involves only n 
such parameters and only 0(nlog(n)) flops for the computation of the product MG (see [HI Section 
11] and |37l Section 7.5]). Alternative application of the orthogonal iterations of [m Section 7.3.2] 
requires order of n^r flops. 

Remark 2.1. Actually the algorithm of m works even where the input includes an upper bound 
r_|_ on the dimension r of the dominant eigenspace U, rather than the dimension itself, and then the 
algorithm can compute this dimension r within the above computational cost as by-product. (The 
integer r = nrank(iJ) can be obtained, e.g., from rank revealing QR factorization of the matrix H.) 

Now suppose that we have an eigenspace generated by r eigenvalues of an n x n matrix. Then the 
following simple theorem (extending the recipe of the Rayleigh quotients) enables us to approximate 
these eigenvalues as the eigenvalues of an auxiliary r x r matrix. 

Theorem 2.2. (Decreasing the Eigenproblem Size to the Dimension of an Eigenspace, cf. \52[ 
Section 2.1].) 

Suppose that M G U G and the matrix U has full column rank r < n and generates 

the space U — IZ{U). Then 
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(i) U is an eigenspace of M if and only if there exists a matrix L G such that MU = UL 

or equivalently if and only if L = U^^'^MU, 

(ii) X{L) C X{M), 

(Hi) MUv = xUv if Lv = xv, 

(iv) the matrix L is unique, that is, its choice is independent of the choice of a matrix U and its 
left inverse and so L = U^MU, for a unitary matrix U. 

The algorithm and the theorem enable us to approximate the r real eigenvalues of a matrix as 
the r dominant eigenvalues of an auxiliary matrix. Theorems 12.21 and 12.31 (below) together suggest 
a direction to such a reduction, and we achieve it in Sections o and 13.31 

2.4 Matrix Functions and Eigenspaces 

Theorem 2.3. (The Eigenproblems for a Matrix and Its Function.J 

Suppose that M is a square matrix and that a rational function f{x) is defined on its spectrum. 

(i) Then f{M)v = f{x)v if Mv = xv. 

(ii) Let lA = denote the eigenspace of the matrix f{M) associated with its eigenvalue pL. 
Then this is an eigenspace of the matrix M associated with all its eigenvalues x such that f{x) = pi. 

(Hi) The space U has dimension 1 and is associated with a single eigenvalue of M if pi is a simple 
eigenvalue of f{M). 

Proof. We readily verify part (i), which implies parts (ii) and (iii). □ 

Remark 2.2. The matrix , for 1 < fc < n, has the single eigenvalue 0 satisfying am{0) = n and 
gm{0) = k, and so dim(Wo,/) = k, for M = Z, f{x) = x^, and k = 1,... ,n. 

Suppose that we have computed a matrix basis U € for an eigenspace W of a matrix 

function f{M) of an n x n matrix M. By virtue of Theorem l2.31 this is a matrix basis of an eigenspace 
of the matrix M. We can first compute a left inverse or the orthogonalization Q = Q{U) and 
then approximate the eigenvalues of M associated with this eigenspace as the eigenvalues of the 
r_|_ X r+ matrix L = MQ (cf. Theorem 12.21) . 

If r = 1, then the matrix U turns into an eigenvector u, shared by the matrices f{M) and M, 
while the matrix L turns into the the Rayleigh Quotient or the simple quotient {M\i)i/ui, 

for any i such that ut 0. 

2.5 Some Maps in the Probenius Matrix Algebra 

Part (i) of Theorem 12.31 implies that, for a polynomial p{x) of (II.II) and a rational function f{x) de¬ 
fined on the set {xi});^^ of its roots, the rational matrix function f{Cp) has the spectrum X{f{Cp)) = 
{/(xdliLi- In particular, the maps 

, „ Cp + C-^ Cp-C-^ 

Cp ^ C-i, Cp ^ aCp + hi, Cp ^ Cl, Cp ^ 2 ^ ^ ^ ^ 

induce the maps of the eigenvalues of the matrix Cp, and thus induce the maps of the roots of its 
characteristic polynomial p{x) given by the equations 

y = I/x, y = ax + b, y = x"^, y = 0.5(a:: -I- I/a;), and y = 0.5(a; — I/a;), 

respectively. The latter two maps can be only applied if the matrix Cp is nonsingular, so that x ^ 0, 
and similarly for the two dual maps below. 

By using the reduction modulo p{x), define the hve dual maps 

2 /= (I/a;) modp(a;), y = ax + b modp(a;), y = x^ modp(x), 
y = 0.5(x-|-1/a;) modp(x), and ?/= 0.5(a; — 1/x) modp(a;), 

where y = y{x) denotes polynomials. Apply the two latter maps recursively, to define two iterations 
with polynomials modulo p{x) as follows, yo = x, yh+i — 0.5{yh + ^/yn) mod p(x) and yo = 
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cc, yh +1 = 0.5(j//i — l/vh) inodp(a;), h = 0,1,.... More generally, define the iteration yo = x, 
Uh+i = cLVh + i>lyh rnod p{x), h = 0,1,..., for any pair of scalars a and 6, provided that yh = 0, for 
none h. 

3 Real Root-finders with Modified Matrix Sign Iterations. 
Variations and Extensions 

In this section we present some efficient numerical real root-finders based on modification of the 
matrix sign iterations applied to the companion matrix of the input polynomial. 

3.1 A Modified Matrix Sign Iterations 

Our first algorithm approximates the real roots of a polynomial p{x) as the real eigenvalues of the 
companion matrix Cp. It applies the matrix iterations 

Mo = Cp, Mh+i = OMMh - M^-i), for /i = 0,1,..., (3.1) 

which modify the matrix sign iterations M^+i = 0.5(M^ -|- (cf. PD]1. 

For every eigenvalue Xj of the matrix Mq = Cp, define its trajectory made up of the eigenvalues 
of the matrices M^, being its images in the maps Mq —>■ M^, for h = 1, 2,3,.... More generally the 
modified Mobius iterations below define a trajectory initiated at any complex point x^^\ 

Hereafter we write sign( 2 :) =sign(Ji( 2 ;)), for a complex number z (cf. [20l page 107]). 

Theorem 3.1. (Convergence of the modified Mobius Iterations, j Fix a complex x = and define 
the modified Mobius iterations 

- l/x^'^)), for = 0,1,.... (3.2) 

(i) The values are real, for all h, if x^°'> is real. 

(ii) - sign(a;)\/^| < for K = and h = 0,1,.... 

Proof. Part (i) is immediately verified. Part (ii) readily extends the similar estimate on m page 
500]. □ 

Theorem 13.11 implies the following result. 

Corollary 3.1. As h —>■ oo, the trajectories of the 2s nonreal eigenvalues of Mq = Cp converge to 
±\/—1 with the quadratic rate of convergence right from the start, whereas the trajectories of the r 
real eigenvalues are real, for all h. 

Algorithm 3.1. Matrix sign iterations modified for real eigen-solving. 

Input: two integers n and r, 0 < r < n, and the coefficients of a polynomial p{x) of equation U.l\) . 

Output: approximations to the real roots xi,...,Xr of the polynomial p{x) or FAILURE with a 
probability close to 0. 

Computations: 

1. Write Mq = Cp and recursively compute the matrices Mh+i of (3J]), for h = 0,1,... (cf 
Corollary 

2. Fix a sufficiently large integer k and compute the matrix M = M^ -|- /„. 

(By extending Corollary ] 3. 1\ observe that the map Mq = Cp ^ M sends all nonreal eigenvalues 
of Cp into a small neighborhood of the origin 0 and sends all real eigenvalues of Cp into the 
ray {x : x>T}.) 
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3. Apply our randomized Alaorithm \2.1\ in order to approximate a unitary matrix U whose columns 
form a basis for the eigenspace associated with the r dominant eigenvalues of the matrix M. 

(By virtue of Theorem \2.3l this is expected to be the eigenspace associated with the real eigenval¬ 
ues of the matrix Cp, although with a probability close to 0 the algorithm can output FAILURE, 
in which case we stop the computations.) 

4-. Compute and output approximations to the r eigenvalues of the r x r matrix L = CpU. 

(They approximate the r real eigenvalues of the matrix Cp by virtue of Theorem 12.21 and 
conseguently approximate the r real roots of the polynomial p{x).) 

Stages 1 and 2 involve 0(fcnlog^(n)) flops by virtue of Theorem 12.II Stage 3 adds 0{n^r) flops 
and the cost a™ of generating n x r standard Gaussian random matrix. Add 0{r^) flops performed 
at Stage 4 and arrive at the overall arithmetic cost bound 0{{knlog^{n) + nr^) + a™. 

Remark 3.1. (Gounting Real Eigenvalues.^ The binary search can produce the number of real 
eigenvalues as the numerical rank of the matrices + I when this rank stabilizes as k increases. 
As the number of real roots increases, so does the size of the matrix L. This has consistently 
implied the decrease of the accuracy of the output approximations in our tests (see the test results in 
Section\^. One can refine these approximations by applying the inverse Power Method or Newton’s 
iterations, but if the accuracy becomes too low, one must extend the precision of computing. 

Remark 3.2. (A.cceleration by Means of Scaling.^ One can dramatically accelerate the initial 
convergence of Alaorithm \3.1\ by applying determinantal scaling (cf. f20f}. that is, by replacing the 
matrix Mg = Cp by the matrix Mg = 0.5{i'Cp — {h'Cp)~^), for v = 1/| det(Cp)|^/" = \pn/po\- 

Remark 3.3. Real and Nearly Real Roots. 

In the presence of rounding errors Alaorithm \3.1\ and all our other algorithms approximate both 
r real and r.\. — r nearly real eigenvalues of the matrix M, for some r_|_ > r. These eigenvalues, 
however, are the roots of p{x) and, under some mild assumptions about the isolation of every such 
a root from the n — 1 other roots, we can refine their approximations very fast (cf. Theorems \B.1\ 
and m)- Then we can readily select the r real eigenvalues among the r-|_ real and nearly real ones. 

Generally, however, the distinction between real and nearly real roots is very slim in our numerical 
algorithms. As this was pointed out in m, in the course of performing the iterations, the real 
eigenvalues can become nonreal, due to rounding errors, and then would converge to ±\/—1. In 
our extensive tests we never observed such a phenomenon, apparently because in these tests the 
convergence to was much slower for the nearly real eigenvalues than for the eigenvalues with 

reasonably large imaginary parts. 

3.2 Inversion-free Variations of the Modified Matrix Sign Iterations and 
Hybrid Algorithms 

The overall arithmetic cost of the Modified Matrix Sign Iterations is dominated by the cost of k 
matrix inversions, that is, 0(fcnlog^(n)) flops (cf. Theorem 12.11) . If all nonreal eigenvalues of the 
matrix Mg lie in the two discs D{±y/—1, 1/2) = {x : \x ± y/—l\ < 1/2}, then we can avoid this 
deflciency by replacing iterations m with any of the two iteration processes 

Mh+i = 0.5(M/ + 3Mh) (3.3) 

and 

Mh+i = -0.125(3M^ + lOM^ + 15M,,), (3.4) 

for h = 0,1,.... In this case, right from the start both iterations send the nonreal roots toward the 
two points ±-v/—1 with quadratic and cubic convergence rates, respectively. (In order to prove this, 
extend the proof of [7l Proposition 4.1].) Both iteration processes keep the real roots real and use 
0(nlog(n)) flops per iteration. 

What if the nonreal roots do not lie in these discs? We can apply the following combination of 
iterations (I3.1I) - (I3.4I) and Corollary [DT] of Section ID] 
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Algorithm 3.2. A Hybrid Algorithm. 

Input, Output as in Alaorithm \‘J.l\ 

Computations: Perform the iterations of Alaorithm lS. R by using matrix inversions and involving 
0(nlog^(n)) flops per iteration, until a test shows that the 2s nonreal eigenvalues of the input 
companion matrix are mapped into the discs D{zLy/—l, 1/2). (For testing this condition, apply 
the algorithm that supports Corollary \D.1\ . To keep the computational cost down, apply this test 
periodically, according to a fixed policy, based on heuristic rules or the statistics of the previous 
tests.) Then shift the computations to the inversion-free iterations i3.3\} or converging faster 

and using 0{n\og{n)) flops per iteration. 

Let us specify some alternative matrix iterations for real root-finding without inversions. Recall 
that sign(M) = and apply the Newton-Schultz iterations for the approximation of the 

matrix square root uni equation (6.3)], 

Ffc+i = 0.5 yfc(3/ - ZkYk), Yo = M-ft 

and 

^fc+i = 0.5 (3/ — Zq = /, 

for k = 0,1,_ The iterations keep real eigenvalues real and converge if 11/ — M“^| |p < 1, for 

p = 1, 2, or 00 . This assumption is easy to satisfy by means of scaling M —>■ aM, which keeps real 
eigenvalues real, for real a. 

The similar coupling technique of [3] is even simpler, because it is applied directly to the modified 
matrix sign iterations (EU, preserving its quadratic convergence to ±\/—l right from the start. 

In our tests for numerical real root-finding, however, we could perform safely only a small number 
of these inversion-free iterations at the initial stage, and then the images of the real eigenvalues of 
the matrix Cp grew very large and the condition numbers of the computed matrices blew up. 

3.3 Numerical Stabilization of the Modified Matrix Sign Iterations 

The images of nonreal eigenvalues of the matrix Cp converge to ±\/—l in the iterations of Stage 1 
of Algorithm 13.11 but if the images of some real eigenvalues of Cp come close to 0, then at the next 
step we would have to invert an ill conditioned matrix Mh unless we are applying an inversion-free 
variant of the iterations of the previous subsection. 

We can try to avoid this problem by shifting the matrix (and its eigenvalues), that is, by adding 
to the current matrix Mh the matrix si, for a reasonably small positive scalar s or —s. We can select 
this scalar by applying heuristic methods or randomization. In our tests this policy has preserved 
convergence quite well, but towards a more radical recipe, we applied the following modification of 
Algorithm 13.11 which immediately involves nonreal values even when the matrix Cp was real. 

Algorithm 3.3. Numerical stabilization of the modified matrix sign iterations. 

Input, Output and Stages 3 and 4 of Computations are as in Algorithm \ 3.1\. except that the 
input includes a small positive scalar a such that no eigenvalues of the matrix Cp have imaginary 
parts close to ±a-v/^ (see Remark \3.4\ below), the set of r real roots xi,... ,Xr of the polynomial 
p{x) is replaced by the set of its r+ roots having the imaginary parts in the range [—a, a], and the 
integer r is replaced by the integer r+ throughout. 

Computations: 

1. Apply Stage 1 of Alaorithm \3.1\ to the two matrices Mq^± = a\/—l I ±Cp, thus producing two 
sequences of the matrices Mh,+ and Mh,-, for h — 0,1,.... 

2. Fix a suffciently large integer k and compute the matrix M = Mk,+ -b Mk,-. 

Because of the assumed choice of a, the matrices a\/^M I ±Cp have no real eigenvalues, and so 
the images of all their eigenvalues, that is, the eigenvalues of the matrices Mk,+ and Mk,-, converge 
to ±V—1 as fc —)> oo. Moreover, one can verify that the eigenvalues of the matrix Mk,+ -b Mk,- 




converge to 0 unless they are the images of the r+ eigenvalues of the matrix Cp having the imaginary 
parts in the range [—a, a]. The latter eigenvalues of the matrix Mk^+ + Mk^- converge to 2-^—1. 
This shows correctness and numerical stability of Algorithm 13.31 

The algorithm approximates the r+ roots of p{x) by using 0{kn\og^{n) + nr^) + ar^n flops, 
versus 0{kn\og^{n) + nr'^) + a™ involved in Algorithm 13.II 

Remark 3.4. We can test the proximity of the roots to a line in two stages: by at first moving 
the line into the unit circle {x : |a;| = 1} (cf. Theorem \A.^) and then applying the algorithm that 
supports Corollarv \D.l[ 

3.4 Square Root Iterations (a Modified Modular Version) 

Next we describe a dual polynomial version of Algorithm 13.II It extends the square root iterations 
yh+i = \{yh -k^/Uh), h = 0,1,..., and at Stage 2 involves the computation of the polynomial 
agcd(p, tk), which denotes an approximate greatest common divisor of the input polynomial p = p(x) 
and an auxiliary polynomial t^ = t)^{x). We refer the reader to [3T], [52], |1], [SS], [TT], and [27] for 
the definitions of this concept and the algorithms for its computation. 

Compared to Algorithm 13.11 we replace all rational functions in the matrix Cp by the same 
rational functions in the variable x and reduce them modulo the input polynomial p{x). The 
reduction does not affect the values of the functions at the roots of p{x), and it follows that these 
values are precisely the eigenvalues of the rational matrix functions computed in Algorithm 13.11 

Algorithm 3.4. Square root modular iterations modified for real root-finding. 

Input: two integers n and r, 0 < r < n, and the coefficients of a polynomial p{x) of eguation iff.111 . 
Output: approximations to the real roots xi,... ,Xr of the polynomial p(x). 

Computations: 

1. (Cf. iTO]! .! Write yo = x and compute the polynomials 

yh+i = ^iyh-i/yh) mod p(a:), h = 0, l,.... (3.5) 

2. Periodically, for some selected integers k, compute the polynomials tk = y^ + 1 mod p(x). 

3. Write gk{x) = agcd(p, tfe) and compute dk = deg{gk(x)). If dk = n — r = 2s, compute the 
polynomial Vk ~p{x)/gk{x) of degree r. Otherwise continue the iterations of Stage 1. 

4- Apply one of the algorithms of and m (cf. Theorem \C.l)) to approximate the r roots 

yi,... ,yr of the polynomial Vk■ Output these approximations. 

Our comments preceding this algorithm show that the values of the polynomials tk{x) at the 
roots of p(x) are equal to the images of the eigenvalues of the matrix Cp in Algorithm 13.11 Hence 
the values of the polynomials tk{x) at the nonreal roots of p{x) converge to 0 as fc —^ oo, whereas 
their values at the real roots of p{x) stay far from 0. Therefore, for sufficiently large integers k, 
agcd(p, tfc) turns into the polynomial nj=r+i(^ ~ This implies correctness of the algorithm. 

Its asymptotic computational cost is 0{knlog^{n)) plus the cost of computing agcd(p, tfe) and 
choosing the integer k (see our next remark). 

Remark 3.5. Compared to Alaorithm \3.1\. the latter algorithm reduces real root-finding essentially 
to the computation of agcdfp, tk). One can apply quite efficient heuristic algorithms for this com¬ 
putation (cf. \3Vf . m, ® m, un, and jjm), but no good formal estimates are available for 
their complexity. One can, however, note that p{x)uk{x) « tk{x)vk(x), and so, assuming that Vk{x) 
is a monic polynomial (otherwise we can scale it), can obtain its other coefficients (as well as the 
coefficients of the polynomial Uk{x)) from the least-squares solution to the associated Sylvester lin¬ 
ear system of equations. Its well known superfast divide and conquer solution involves order of 
nlog^(n) arithmetic operations (cf. \29[ Chapter 5]), but the recent numerically stable algorithm of 
M accelerated by a factor of login) in J361 Section 9.8] involves only 0{n\og{n)) flops. 


9 







4 Numerical Tests 


Extensive numerical tests of the algorithms of these paper, performed in the Graduate Center of 
the City University of New York, are the contribution of the second author (at some points he 
was assisted by Ivan Retamoso). The tests recorded the number of iterations and the error of the 
approximation of the real roots of benchmark polynomials to which we applied these algorithms. We 
have recorded similar data also for the approximation of real eigenvalues of some random matrices M 
by means of applying Algorithms 13.11 and 13.31 In the latter case the convergence of these algorithms 
and the number of their iterations depended mostly on the characteristic polynomials of M, even 
though the estimates for the arithmetic cost of performing each iteration generally grew compared 
to the special case where M = Cp. 

In some cases we stopped the iterations already when they produced crude approximation to 
the roots. This is because, instead of continuing the iterations, we can apply the algorithms of [JD] 
followed by Newton’s or Ehrlich-Aberth’s iterations (cf. Section |B]), which refine very fast these 
crude approximations. 

Finally we note that the test results in the present section are quite encouraging (in spite of our 
caveat in Remark l3.1L e.g., the numbers of iterations required for convergence of our algorithms have 
grown very slowly (if at all) when we increased the degree of the input polynomials and dimension 
of the input matrices from 64 to 1024. 

The implementation is available upon request. 

4.1 Tests for the Modified Matrix Sign Iterations (Algorithm 13.1|) 

In the first series of the tests. Algorithm 13.11 has been applied to one of the Mignotte benchmark 
polynomials, namely p{x) = x" + (lOOx — 1)^. It is known that this polynomial has three ill 
conditioned roots clustered about 0.01 and has n — 3 well conditioned roots. In the tests. Algorithm 
13.II has output the roots within the error less than 10by using 9 iterations, for n = 32 and n = 64 
and by using 11 iterations, for n = 128 and n = 256. 

In the second series of the tests, polynomials p{x) of degree n = 50,100,150, 200, and 250 have 
been generated as the products p(x) = fi{x)f 2 ix), for the rth degree Chebyshev polynomial fi{x) 
(having r real roots), r = 8,12,16, and f 2 {x) = i.i.d. standard Gaussian 

random variables, for j = 0,..., n — r. Algorithm ITT] (performed with double precision) was applied 
to 100 such polynomials p(x), for each pair of n and r. Table |4T] displays the output data, namely, 
the average values and the standard deviation of the numbers of iterations and of the maximum 
difference between the output values of the roots and their values produced by MATLAB root-finding 
function ”roots()”. 

In the third series of the tests, Algorithm 13.11 approximated the real eigenvalues xi,.. .,Xr of 
a random complex symmetric matrix A = U'^'EU, for E = diag{xi, ... ,Xr,yi, ■■■ ,yn-r), r i.i.d. 
standard Gaussian real random variables xi, ... ,Xr, n — r i.i.d. standard Gaussian complex (non- 
real) random variables yi, ..., yn-r, and a n x n standard Gaussian random orthogonal matrix U. 
Table [redisplays the mean and standard deviation of the number of iterations and the error bounds 
in these tests, for n = 50,100,150, 200, 250 and r = 8,12,16. 

In order to estimate the number of iterations required in our algorithms, we periodically estimated 
the numerical rank of the associated matrix in every k successive iterations, for fc = 5 in most of our 
experiments. 
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Table 4.1: Number of Iterations and Error Bounds for Algorithm 13 .1 1 on Random Polynomials 


n 

r 

Iteration-mean 

Iteration-std 

Error-mean 

Error-std 

50 

8 

7.44 

1.12 

4.18 X lO-*^ 

1.11 X lO"'^ 

100 

8 

8.76 

1.30 

5.90 X lO-*^ 

1.47 X lO"'^ 

150 

8 

9.12 

0.88 

2.61 X 10“^ 

1.03 X 10-'‘ 

200 

8 

9.64 

0.86 

1.48 X 10-“ 

5.93 X 10-“ 

250 

8 

9.96 

0.73 

1.09 X 10“^ 

5.23 X lO"'^ 

50 

12 

7.16 

0.85 

3.45 X 10-4 

9.20 X 10-4 

100 

12 

8.64 

1.15 

1.34 X 10“^ 

2.67 X lO”'^ 

150 

12 

9.12 

2.39 

3.38 X 10-^ 

1.08 X 10“^ 

200 

12 

9.76 

2.52 

6.89 X lO-*^ 

1.75 X lO-'^ 

250 

12 

10.04 

1.17 

1.89 X lO”'^ 

4.04 X lO”'^ 

50 

16 

7.28 

5.06 

3.67 X 10-^ 

7.62 X 10“^ 

100 

16 

10.20 

5.82 

1.44 X 10“^ 

4.51 X 10“^ 

150 

16 

15.24 

6.33 

1.25 X 10-^ 

4.90 X 10"^ 

200 

16 

13.36 

5.38 

1.07 X 10“^ 

4.72 X 10“^ 

250 

16 

13.46 

6.23 

1.16 X 10-4 

2.45 X 10“^ 


Table 4.2: Number of Iterations and Error Bounds for Algorithm 13 .1 1 on Random Matrices 


n 

r 

Iteration-mean 

Iteration-std 

Error- 

mean 

Error-std 

50 

8 

10.02 

1.83 

5.51 

X 

10-^^ 

1.65 X 10”!^ 

100 

8 

10.81 

2.04 

1.71 

X 

10-1^ 

5.24 X 10“^^ 

150 

8 

14.02 

2.45 

1.31 

X 

10-1^ 

3.96 X 10-1^ 

200 

8 

12.07 

0.94 

2.12 

X 

T(pn- 

6.70 X 10-^^ 

250 

8 

13.59 

1.27 

2.75 

X 

lo-^u 

8.14 X 10-1^ 

50 

12 

10.46 

1.26 

1.02 

X 

10-1^ 

2.61 X 10-1^ 

100 

12 

10.60 

1.51 

1.79 

X 

lo-^'J 

3.66 X lO-i*' 

150 

12 

11.25 

1.32 

5.69 

X 

io-« 

1.80 X 10-'^ 

200 

12 

12.36 

1.89 

7.91 

X 

lo-^u 

2.50 X 10-y 

250 

12 

11.72 

1.49 

2.53 

X 

10-1^ 

3.84 X 10-1^ 

50 

16 

10.10 

1.45 

1.86 

X 

lo-y 

5.77 X 10-y 

100 

16 

11.39 

1.70 

1.37 

X 

■YipTD- 

2.39 X lO-^'^ 

150 

16 

11.62 

1.78 

1.49 

X 

T(pn- 

4.580 X 10-^1 

200 

16 

11.88 

1.32 

1.04 

X 

10"^^ 

2.09 X 10“^^ 

250 

16 

12.54 

1.51 

3.41 

X 

T(pn- 

1.08 X lO-i'* 
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4.2 Tests for the Stabilized Matrix Sign Iterations (Algorithm 13.3|) 
Applied to Polynomials 

We tested Algorithm 13.31 on various modified benchmark polynomials from the website of MPSolve 
(http://numpi.dm.unipi.it/mpsolve-2.2/). With the exception of the polynomials of Type IV below, 
we tested benchmark polynomials that had only trivial real roots 0 and ±1, and we multiplied them 
by Chebyshev polynomials of degree r, for r = 8,12, and 16, which have only real roots. 

Having generated such a polynomial p = p{x) and its companion matrix C/, we computed the 
condition numbers of the matrices Nk = Cp + 2^+^ with k = 1,2,... and selected an integer k 
such that K{Nk) < 10^. Clearly, this is ensured for sufficiently large integers k defining diagonally 
dominant matrices Nk, but in our tests k was less than five in most cases. 

Having fixed k and Nk and following the description of Algorithm 13.31 we computed at first the 
matrices Yi = a/„ +Nk and Y 2 = aln — Nk, tor a = 0.0001\/^, and then successively the matrices 
V+ij = ^(Yij — Y-~^) with Yqj = Yj, for j = 1,2 (cf. Algorithms 13.II and 13.31) . 

We have observed that with our real shifts by 2^+^/ji at the initial stage, non-real eigenvalues of 
Yi and Y 2 were never close to ±-\/—1 at the first 7 + k iterations. So we began checking convergence 
only when we have performed these initial iterations, and since that moment we checked convergence 
in every five iterations. As soon as we observed that nrank(y/) = r, for 1/' = -f 1^,2 and for r 
denoting the number of distinct real roots of p{x), r = 8, 12,16, we stopped the iterations and moved 
to the final stage of the algorithm, that is, approximated the real eigenvalues of matrix Cp, equal to 
the real roots of the polynomial p(x). 

We have run numerical tests on polynomials of five types having degree n = 64,128, 256, 512,1024, 
and we compared our results with the outputs of MATLAB function ”roots()”: 

I. p(x) = pi(x)p 2 (x), where pi(x) is the r-th degree Chebyshev polynomial and P 2 (x) = x"”’’ — 1. 

H. p(x) = pi(x)p 2 (x), where pi{x) is the r-th degree Chebyshev polynomial and P 2 {x) = 1 -f 
2x + 3x^ -I- • • • -I- (n — r -I- l)x"'“’’. 

HI. p{x) = pi(x)p 2 (x), where pi(x) is the r-th degree Chebyshev polynomial and P 2 (x) = (x -f 
l)(x -I- a)(x -I- a^) ■ - -{x + with a = 

IV. p{x) = x"“’' — (ax — 1)^, where a = 60, 80,100. 

V. p(x) = pi{x)p 2 {x), where pi{x) is the r-th degree Chebyshev polynomial and P 2 {x) = 
Sfe=o , with oo,..., a„ being i.i.d. standard random variables. 

Tables l4l^4.6l display the number of iterations and the maximum error bounds, for the polyno¬ 
mials of Types YIV (cf. our Remark l3.ip . Table HTTl shows the average error bounds and the average 
numbers of iterations in 50 tests with the polynomials of Type V. 
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Table 4.3: Number of Iterations and Error Bounds for Algorithm 13.31 on Type I Polynomials 


n 

r 

Iterations 

Errors 

64 

8 

10 

1.03A- 10 

64 

12 

23 

1.32A- 08 

64 

16 

23 

3.97A- 06 

128 

8 

10 

1.60A- 10 

128 

12 

23 

4.91A- 04 

128 

16 

23 

2.22A- 03 

256 

8 

10 

6.18A- 06 

256 

12 

28 

1.75A- 09 

256 

16 

28 

3.54A- 06 

512 

8 

15 

8.05A- 13 

512 

12 

28 

1.71A- 08 

512 

16 

28 

2.78A- 05 

1024 

8 

15 

2.33A- 12 

1024 

12 

28 

1.27A- 09 

1024 

16 

28 

2.19A- 05 


Table 4.4: Number of Iterations and Error Bounds for Algorithm 13.31 on Type II Polynomials 


n 

r 

Iterations 

Errors 

64 

8 

10 

1.53E- 11 

64 

12 

23 

1.30E- 07 

64 

16 

23 

1.40E- 05 

128 

8 

28 

9.42E- 11 

128 

12 

10 

7.51E- 08 

128 

16 

28 

2.27E- 04 

256 

8 

28 

1.92E- 11 

256 

12 

28 

2.21E- 07 

256 

16 

28 

1.69E- 03 

512 

8 

28 

3.68E- 12 

512 

12 

28 

2.17E- 06 

512 

16 

33 

1.53E- 02 

1024 

8 

28 

2.96E- 11 

1024 

12 

33 

5.00E- 07 

1024 

16 

33 

3.58E- 03 
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Table 4.5: Number of Iterations and Error Bounds for Algorithm 4.3 on Type III Polynomials 


n 

r 

Iterations 

Errors 

64 

8 

28 

4.63A- 11 

64 

12 

23 

1.69A- 07 

64 

16 

28 

7.36A- 06 

128 

8 

28 

3.83A- 12 

128 

12 

23 

1.45A- 08 

128 

16 

28 

1.68A- 05 

256 

8 

28 

1.58A- 12 

256 

12 

23 

1.02A- 04 

256 

16 

28 

6.50A- 04 

512 

8 

28 

7.69A- 13 

512 

12 

23 

5.00A- 09 

512 

16 

28 

8.60A- 06 

1024 

8 

28 

9.90A- 14 

1024 

12 

23 

1.45A- 09 

1024 

16 

28 

2.64A- 05 


Table 4.6: Number of Iterations and Error Bounds Algorithm 4.3 on Type IV Polynomials 


n 

a 

Iterations 

Errors 

64 

60 

41 

2.43E - 04 

64 

80 

42 

7.98E - 04 

64 

100 

43 

1.72E-05 

128 

60 

41 

1.12E-03 

128 

80 

42 

4.43E - 04 

128 

100 

43 

1.31E-04 

256 

60 

41 

2.10£;-04 

256 

80 

42 

1.91£;-04 

256 

100 

43 

1.34E - 04 

512 

60 

41 

3.37E - 04 

512 

80 

42 

1.80E - 04 

512 

100 

43 

8.33E-05 

1024 

60 

36 

I.IOE-Ol 

1024 

80 

42 

1.16£;-04 

1024 

100 

43 

1.76£;-04 
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Table 4.7: Number of Iterations and Error Bounds for Algorithm 4.3 on Type V Polynomials 


n 

r 

Iterations 

Errors 

128 

8 

22.3 

5.33E- 06 

128 

12 

24.6 

4.85E- 05 

128 

16 

24.94 

3.59E- 03 

256 

8 

26.02 

l.llE- 06 

256 

12 

27.01 

2.37E- 05 

256 

16 

30.18 

1.80E- 03 

512 

8 

27.54 

2.73E- 08 

512 

12 

28.00 

2.27E- 06 

512 

16 

38.18 

2.39E- 03 
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4.3 Tests for the Stabilized Matrix Sign Iterations (Algorithm 13.3|) on 
Gaussian Random Matrices 

We tested Algorithm 13.31 on randomly generated matrices of two types: 

Type I: Gaussian random tridiagonal matrices of dimension n = 64,128, 256, 512,1024. We 
generated each entry in the tridiagonal part independently by using standard Gaussian distribution 
and set the other entries to 0. Our tables show the error bounds equal to the maximal difference 
of the outputs of our algorithm and MATLAB function ”eig()”. We generated 100 matrices, for 
each n, and recorded the mean and standard deviation of the error bounds and of the numbers of 
iterations. 

Type II: Random matrices A with a fixed number of real eigenvalues. At first we generated a 
diagonal matrix E with r diagonal entries under the standard real Gaussian distribution and n — r 
diagonal entries under the standard complex Gaussian distribution, for n = 64,128, 256, 512,1024 
and r = 8,12,16. Then we generated a standard Gaussian orthogonal matrix Q. Finally we com¬ 
puted the matrices A = Q^'EQ. We generated 100 such matrices A, for each pair of n and r, and 
recorded the mean and standard deviation of the error bounds and of the numbers of iterations. 

The following two tables summarize the performance data, showing a low number of iterations 
required for ensuring the approximation of the eigenvalues with a reasonable precision. 


Table 4.8: Number of Iterations and Error Bounds for Root-finding Algorithm l3. 3l on Type 


I matrices 


n 

Iteration-mean 

Iteration-std 

Error-mean 

Error-std 

64 

10.70 

2.36 

1.78E- 06 

1.14E- 05 

128 

12.16 

3.34 

5.68A- 07 

4.49E - 06 

256 

12,97 

3.97 

3.26E;- 06 

1.35E- 05 

512 

15.46 

9.82 

8.80E- 04 

8.44E - 03 

1024 

16.52 

10.26 

2.43E- 03 

2.25E- 02 


Table 4.9: Number of Iterations and Error Bounds for Algorithm 13.31 on Type II matrices 


n 

r 

Iteration-mean 

Iteration-std 

Error-mean 

Error-std 

64 

8 

11.65 

2.47 

3.69E - 08 

2.29E;- 07 

64 

12 

11.75 

2.50 

3.98E- 10 

2.71E- 09 

64 

16 

11.60 

2.45 

4.10E-09 

3.88E - 08 

128 

8 

13.75 

2.79 

1.17E-08 

7.56E- 08 

128 

12 

13.70 

2.90 

4.41E-09 

2.73E - 08 

128 

16 

13.65 

2.55 

1.23E-07 

1.34E- 06 

256 

8 

14.55 

3.26 

5.59E-09 

5.58E- 08 

256 

12 

14.15 

3.70 

1.38E-07 

1.38E- 06 

256 

16 

14.70 

2.54 

3.06E- 11 

1.93E- 10 

512 

8 

13.65 

5.59 

5.08E- 10 

4.88E - 09 

512 

12 

15.65 

9.47 

7.46E; - 04 

7.46E - 03 

512 

16 

16.55 

10.26 

2.78E - 03 

5A7E - 03 

1024 

8 

18.20 

15.35 

2.33E- 10 

1.22E- 09 

1024 

12 

20.85 

17.60 

1.27E-07 

3.36E - 07 

1024 

16 

24.35 

19.56 

2.19E;-03 

4.33E - 03 


16 


































4.4 Tests for a Hybrid Matrix Algorithm on Benchmark Polynomials 

We performed numerical tests of a hybrid algorithm. We began with Algorithm 13.11 and after 
sufficiently many iterations continued with its variation avoiding matrix inversion. 

Namely, we first applied a real shift /3J to the companion matrix Cp, such that the matrix 
M = Cp + f3I had condition number less than 10^. Based on our preliminary tests, we expected 
that, for such inputs, at least T = log 2 /3 iterations M^+i = \{Mi — M~^) would be required in order 
to move the complex nonreal eigenvalues close enough to i-y/—1. After the first T iterations, we 
periodically (in every 5 iterations) applied two iterations + 3Mi), which converged 

with cubic rate provided that all complex eigenvalues have distance less than i from or 
Before switching to the iterations of the second type, we performed the following transformation in 
order to avoid problems of numerical stability: 

Step 1: Compute P = which maps the real line into the unit circle. 

Step 2: Compute Y = {P — P~^), mapping the unit circle onto the interval [—2/3, 2/3]. 

Note that these two maps together keep the values ±-\/^ unmoved. 

We tested polynomials of Types II and IV of the previous section. For polynomials of Types I, 
III, and V, the test results were similar to those for polynomials of Type II, apparently due to the 
shared Chebyshev factors. The test results on Type IV polynomials indicate the strength of this 
algorithm in the case of clustered roots. 

The number of iterations required and the error bound are displayed in the tables below. 


Table 4.10: Number of Iterations and Error Bounds for Hybrid Algorithm on Type II Polynomials 


n 

r 

Iterations 

Errors 

64 

8 

10 

3.69A- 10 

64 

12 

23 

4.96A- 08 

64 

16 

23 

4.97A- 03 

128 

8 

10 

2.28A- 11 

128 

12 

28 

1.97A- 07 

128 

16 

23 

8.68A- 02 

256 

8 

28 

6.56A- 12 

256 

12 

28 

3.64A- 07 

256 

16 

28 

3.82F;- 04 

512 

8 

15 

8.05A- 13 

512 

12 

28 

1.71A- 08 

512 

16 

28 

2.78A- 05 

1024 

8 

28 

3.72F;- 11 

1024 

12 

28 

1.09A- 08 

1024 

16 

33 

2.19A- 05 


Table 4.11: Number of Iterations and Error Bounds for Hybrid Algorithm on Type IV Polynomials 


n 

Iterations 

Errors 

64 

33 

7.32E - 05 

128 

33 

6.12E-06 

256 

38 

1.60E-05 

512 

38 

1.08E - 04 

1024 

38 

9.19E-01 
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4.5 Tests for the Modular Square Root Iterations (Algorithm 13.4|) 

Table 0321 displays our test results for Algorithm [TH that is, for the iterations fi+i{x) = — 

fi{x)~^) modp(a;), which computed polynomial inverses modulo p(x) by solving the associated 
Sylvester linear systems of equations. We applied the tests to polynomials of Types I and If. 

Already after a small number of iterations, that is, for small integers i, the tests have consistently 
produced polynomials fi(x) whose roots approximated the complex roots of the polynomial p{x) of 
(dD within the fixed tolerance bound e = 10 At this stage of our preliminary tests we applied 
the MATLAB function ”roots()” in order to avoid actual computation of agcds. Namely, as soon 
as we observed that the polynomial p(x) shared all its complex roots with the polynomial fi{x), we 
stopped the iterations and computed the approximate quotient [p(a:)/agcd(p, fi)], which had degree 
r and had r real roots, all shared with p{x). 


Table 4.12: Number of Iterations for Algorithm 13.41 on Polynomials of Types I and II 


n 

r 

Type I 

Type II 

64 

8 

9 

14 

64 

12 

4 

16 

64 

16 

2 

17 

128 

8 

9 

14 

128 

12 

12 

16 

128 

16 

2 

17 

256 

8 

9 

14 

256 

12 

12 

16 

256 

16 

8 

17 

512 

8 

9 

14 

512 

12 

12 

16 

512 

16 

8 

17 

1024 

8 

10 

14 

1024 

12 

12 

16 

1024 

16 

11 

17 
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Appendix 

A Some Maps of the Variables and the Roots 

Some basic maps of the roots of a polynomial can be computed at a linear or nearly linear arithmetic 
cost. 

Theorem A.l. (Root Inversion, Shift and Scaling, cf. \29^.) 

(i) Given a polynomial p(x) of il.ll) and two scalars a and b, one can compute the coefficients of 
the polynomial q{x) = p{ax J- b) by using 0(n log(n)) arithmetic operations. This bound decreases 
to 2n — \ multiplications if b — 0. 

(a) Reversing apolynomial inverts all its roots by involving no flops, that is, Prev{x) = x"^p(\/x) = 

J27=oPi^''~' = Pn nr=i(^ “ 
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Note that by shifting and scaling the variable, we can move all roots of p(x) into a fixed disc, 
e.g., D{0,1) = {x : |a;| < 1}. 

Theorem A.2. (iDandelin’s Root Squaring, cf. f^7]/. ) 

(i) Let a polynomial p{x) of U.l\) be monic. Then q(x) = {—\)'^p{y/x )p{—^/x ) = ~ ^j)- 

(ii) One can evaluate p{x) at the k-th roots of unity, for k > 2n, and then interpolate to q{x) by 
using 0(fclog(fc)) f arithmetic operations overall. 

Remark A.l. Recursive root-squaring is prone to numerical stability problems because the coef¬ 
ficients of the iterated polynomials very quickly span many orders of magnitude. It is somewhat 
surprising, but the Boolean complexity of the recursive root-squaring process is relatively reasonable 
if high output precision is required \27}/ , \S^ . Moreover, one can avoid numerical stability problems 
and perform all iterations with double precision by applying a special tangential representation of the 
coefficients and of the intermediate results proposed in f23j/ . In this case the computations involve 
more general operations than flops, but in terms of the CPU time the computational cost per iteration 
has the same order as flops, performed with double precision. 

Theorem A.3. (The Cayley Maps.J 

(i) The map y = {x — a\/—l )/{x + a^—1 ), for any real nonzero scalar a, sends the real axis 
{x : X is real} onto the unit circle (7(0, 1) = {y : \y\ = 1}. 

(ii) The converse map x = a^f—l (1 — y)/{y-\- 1) sends the unit circle (7(0,1) onto the real axis. 


B Some Functional Iterations for Polynomial Root-finding 


Newton’s and Ehrlich-Aberth’s are two celebrated functional iteration processes for the approxima¬ 
tion of a single root of a polynomial p{x) of (11.11) and all its roots, respectively. They are highly 
efficient and popular, but not specialized to our task of approximating real roots, and we only use 
them as auxiliary root-refiners. 

Hereafter a disc D{X, r) is said to be 'y-isolated for a polynomial p{x) and 7 > 1 if it contains all 
roots of the polynomial lying in the disc D{X, yr). In this case we say that the disc has the isolation 
ratio at least 7 . 

Newton’s iterations refine an approximation y^^^ to a single root of a polynomial p{x) of (II.1[) . 

yo = c, = y^^^ - p{y^^^) (pfly^^^), h = Q,l,... (B.l) 


Ehrlich-Aberth’s iterations refine n simultaneous approximations , ■. ■, to all n roots 
xi,... ,Xn of such a polynomial. 




= 4^^ - l/eP\ for ef’ =p(zfVp(4^’) “ (h) ^ (h) ’ * = ■ T, 


(B.2) 




- 2, 


See [24], [25] for various other functional iterations. 

As we can see next, both iterative algorithms refine very fast the crude initial approximations to 
simple isolated roots of a polynomial. 

Theorem B.l. Assume a polynomial p = p(x) of hl.l\) and let 0 < 3(n — l)|yo — a^i| < \yo — xfl, 
for j = 2,... ,n. Then Newton’s iterations W. 1\) converge to the root xi quadratically right from the 
start, namely, \yk — xi| < 2 |j/o — a;i|/ 2 ^ , for k = 0,1,.... 

Proof. See |48l Theorem 2.4], which strengthens |43l Corollary 4.5]. □ 

Theorem B.2. (See Theorem 3.3].) Assume a polynomial p = p{x) of i B.lj) and crude initial 
approximations y^^^ to the roots Xj such that 0 < 3\/n — 1 ~ ^j\ < 14°^ ~ /o’" * 

j = l,...,n. Then Ehrlich-Aberth’s iterations converge to the roots xj with the cubic rate right 
from the start, namely, — Xj\ < 12 /]°^ ~ 2 ;^|/( 2 ^ •^/(n — 1) ), for j = 1,... ,n and k = 0,1,.... 
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The paper [35] also proves quadratic convergence of the WDK iterations to all n roots, lying in 
some given discs with an isolation ratios at least 3(n — l)/8. These iterations are due to Weierstrass 
ED, but are frequently attributed to its later re-discoveries by Durand in 1960 and Kerner in 1966. 

By exploiting the correlations between the coefficients of a polynomial and the power sums of its 
roots, the paper [40] had weakened the above assumptions on the initial isolation. More precisely, 
assuming that a simple root lies in the disc D{Q, 1) and that the disc has an isolation ratio at least 
s > 1 -I- l/log 2 (n), the paper [40] increases it to cn‘^, for any fixed pair of constants c and d, at 
the arithmetic cost 0{n), and similarly increases the isolation ratio of the n discs covering all the n 
roots at the arithmetic cost 0(nlog^(n)). 

In the case of a single disc, one can assume even an isolation ratio s > 1 -I- , for any pair of 

constants c' and d', and then increase it to s > cn'^, for any other pair of constants c and d, at the 
arithmetic cost 0{n log^(n)). Indeed one can achieve this by performing h root-squaring iterations of 
Theorem lA.2[ for h of order log(n) because each squaring of the roots also squares the isolation ratio. 
This lifting process ensures the desired isolation for the lifted roots of the new lifted polynomial, 
but the descending back to the original roots can be also achieved by using 0{nlog^{n)) arithmetic 
operations [22], m- We refer the reader to Remark lA.il on the precision growth in these iterations 
and their Boolean complexity. 

Can we completely relax the assumption of the initial isolation? Empirically fast global con¬ 
vergence (that is, convergence right from the start) is very strong over all inputs for the WDK, 
Ehrlich-Aberth, and some other iterations that approximate simultaneously all n roots of a poly¬ 
nomial p{x) of (11.11) . The papers [34], [41], and [35] have challenged the researchers to support this 
observation with a formal proof, which is still missing, however. 


C Fast Root-finding Where All Roots Are Real 

Theorem C.l. Assume that all roots of a polynomial p{x) of are real. 

(i) Then the modified Laguerre algorithm of m converges to all of them right from the start, 
uses 0(n) flops per iteration, and therefore approximates all the n roots within e = Xflf by using 
0(log(5)) iterations and performing 0(nlog{b)) flops. 

(ii) The latter asymptotic arithmetic cost bound is optimal and is supported by the alternative 
algorithms of m and as well. 

(Hi) All these algorithms reach the optimal Boolean cost bound up to polylogarithmic factors. 


D Counting the Roots in a Disc. Root Radii, Distances to 
the Roots, and the Proximity Tests 

In this subsection we estimate the distances to the roots of p{x) from a complex point and the 
number of the roots in an isolated disc. 

The latter task can be solved by using the following result from [43[ Lemma 7.1] (cf. also [44l 
Theorem 14.1] and i)- 

Theorem D.l. \43\ Lemma 7.1] It is sufficient to perform FFT at n' = 16|"log2n] points (using 
1.5n'log(n') flops) and 0(n) additional flops and comparisons of real numbers with 0 in order to 
compute the number of roots of a polynomial p{x) of U.l\) in a 9-isolated disc D{0, r). 

Remark D.l. The algorithm of 143^ supporting Theorem ID. II only uses the signs of the real and 
imaginary parts of the n output values of FFT. For some groups of the values, the pairs of the signs 
stay invariant and can be represented by a single pair of signs. Can this observation be exploited in 
order to decrease the computational cost of performing the algorithm? 

Corollary D.l. It is sufficient to perform 0{hnlog(n)) flops and 0{n) comparisons of real numbers 
with 0 in order to compute the number of roots of a polynomial p{x) of in an s-isolated disc 
D{0, 1), for s = 9^/^ and any positive integer h. 
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Proof. Every root-squaring of Theorem I A. 2 1 so uares all root-radii and the isolation ratios of all discs 
£)(0, r), for all positive r. Suppose h repeated squaring iterations map a polynomial p{x) into Phix), 
for which the disc D{0, 1) is 9-isolated. Then, by applying Theorem lD.il we can compute the number 
of roots of Phix) in this disc, equal to the number of roots of pix). □ 

In view of Remark lA.ll one must apply the slower operations of [23] or high precision computa¬ 
tions in order to support even a moderately long sequence of root-squaring iterations, but in some 
cases it is sufficient to apply Corollarv lD.il for small positive integers h. Note that 9^/^ is equal to 
1.3160..., for h = 2, to 1.1472..., for h = 3, to 1.0710..., for h = A, and to 1.0349..., for h = 5. 

We can use the following result if we agree to perform computations with extended precision. 

Theorem D.2. (The Root Radii Approximation.J 

Assume a polynomial pix) of and two real scalars c > 0 and d. Define the n root radii 

Tj = \xkj \, for j = 1,..., n, distinct ki,..., and ri > r 2 > ■ ■ ■ > rn- Then, by using 0(n log^(n)) 
arithmetic operations, one can compute n approximations fj to the root radii rj such that fj < Vj < 
il +c/n'^)fj, for j = l,...,n. 

Proof. (Cf. [33], [3H1 Section 4].) At first fix a sufficiently large integer k and apply k times the 
root-squaring of Theorem 1A.21 which involves 0(fcnlog(n)) arithmetic operations. Then apply the 
algorithm of |44) (which uses 0(n) arithmetic operations) in order to approximate within a factor 
of 2n all root radii , j = 1,..., n, of the output polynomial Pkix). By taking the 2^-th 

roots, approximate the root radii ri,... ,r„ within a factor of ( 2 n)^/^ , which is 1 -f c/n'^, for k of 
order log(n). □ 

Alternatively we can approximate the root radii by applying the semi-heuristic method of | 2 ], 
used in the packages MPSolve 2000 and 2012 (cf. |5] and [1^) or by recursively applying Theorem 
ID.11 although neither of these techniques support competitive complexity estimates. 

The following two theorems define the largest root radius ri of the polynomial pix). 

Theorem D.3. (See fj^.) Assume a polynomial pix) of U.l\) . Write ri = max"^^ \xj\, = 

min"^j^ \xj\, and 7 + = max'hj^ \pn-i/pn\- Then < ri < 27 +. 

Theorem D.4. (See } For e = 1/2^ > 0, one only needs a(n, e) = 0(n -|- 61og(6)) flops to 
compute an approximation ri^^ to the largest root radius ri of pix) such thatri,, < ri < 5(1-I-e)ri_e. 
In particular, a(n,e) = 0(n), for b = 0(n/log(n)), and a(n, e) = 0(nlog(n)), for b = 0(n). 

Both theorems can be immediately extended to the approximation of the smallest root radius 
because it is the reciprocal of the largest root radius of the reverse polynomial Prewix) = x'^pil/x) 
(cf. Theorem lA.il) . Moreover, by shifting a complex point c into the origin (cf. Theorem lA.ll) . we 
can turn our estimates for the root radii into the estimates for the distances to the roots from the 
point c. Approximation of the smallest distance from a complex point c to a root of pix) is called 
the proximity test at the point. One can perform such a test by applying Theorems lD.il ID.31 or lD.4l 
Alternatively, for heuristic proximity tests by action at a point c or at n points, one can apply 
Newton’s iterations (IB. II) or an appropriate functional iterations, such as the Ehrlich-Aberth iter¬ 
ations (|B.2|) . and estimate the distance to the roots by observing convergence or divergence of the 
iterations. 

Theorem lD.4l and all these iterations, including Newton’s, Ehrlich-Aberth’s and WDK’s, can be 
applied even where a polynomial pix) is defined by a black box subroutine for its evaluation rather 
than by its coefficients. 
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